WHAT IS CLAIMED IS:

1. A method for clustering a plurality of items, each of the items including information, guided toward an initial organization structure, the method comprising:

inputting a plurality of items, each of the items including information, into a clustering process;

inputting an initial organization structure into the clustering process, the initial organization structure including one or more categories, at least one of the categories being associated with one of the items;

processing using at least processing hardware the plurality of items based upon at least the initial organization structure and the information in each of the items in at least the clustering process;

determining a resulting organization structure based upon the processing, the resulting organization structure more closely resembling the initial organization structure than if an empty organization structure or an alternative initial organization structure had been input into the clustering process; and

storing the resulting organization structure in memory.
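
The steps of claim 1 can be illustrated with a minimal sketch. It assumes one possible clustering process, a seeded k-means, which the claim does not specify. The names `seeded_clustering`, `centroid`, `items`, and `initial_structure` are hypothetical, and categories with no associated items are simply skipped.

```python
from math import dist


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def seeded_clustering(items, initial_structure, iterations=10):
    """Cluster `items` (name -> feature vector), starting from the
    categories of `initial_structure` (category -> list of item names).

    Assumption: seeded k-means stands in for the clustering process.
    Returns the resulting structure as category -> sorted item names.
    """
    # Seed one centroid per initial category from its pre-assigned items.
    centroids = {
        cat: centroid([items[name] for name in members])
        for cat, members in initial_structure.items()
        if members
    }
    result = {}
    for _ in range(iterations):
        # Assign every item to the category with the nearest centroid.
        result = {cat: [] for cat in centroids}
        for name, vec in items.items():
            nearest = min(centroids, key=lambda c: dist(vec, centroids[c]))
            result[nearest].append(name)
        # Recompute centroids; keep the old one if a category emptied out.
        centroids = {
            cat: centroid([items[n] for n in names]) if names else centroids[cat]
            for cat, names in result.items()
        }
    return {cat: sorted(names) for cat, names in result.items()}


# Illustrative data only: six items, two seed categories with one item each.
items = {
    "a": [0.0, 0.1], "b": [0.2, 0.0], "c": [0.1, 0.3],
    "x": [5.0, 5.1], "y": [5.2, 4.9], "z": [4.8, 5.0],
}
initial = {"low": ["a"], "high": ["x"]}
resulting = seeded_clustering(items, initial)  # held in memory as a dict
```

Because the centroids start from the initial categories, the resulting structure keeps the category names and seed items of the initial structure, which is one way the result can resemble the initial structure more closely than an unseeded run would.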

2. The method of claim 1 wherein the processing comprises determining a likeness level between a first item and a second item, the likeness level between two items increased if they are both similar to items in one or more of the categories of the initial organization structure.

3. The method of claim 2 wherein the determining the likeness level between the first item and the second item comprising:

associating a first feature vector with the first item and a second feature vector with the second item, each feature vector representing information associated with each item;

adding a first additional feature and a second additional feature to the first feature vector and the second feature vector of the first item and the second item, respectively, the first additional feature representing a first category of the initial organization structure and the second additional feature representing a second category of the initial organization structure, the first additional feature providing a degree to which the first item is similar to one or more items in the first category of the initial organization structure; and

calculating a degree of similarity of the first item and the second item including calculating a similarity measure between the first additional feature and the second additional feature.
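
The likeness computation of claims 2 and 3 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: cosine similarity is assumed as the similarity measure, each additional feature is assumed to be the mean cosine similarity between an item and the members of one category, and the names `cosine`, `category_features`, `likeness`, and `weight` are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def category_features(vec, categories, weight=1.0):
    """One additional feature per category: the mean cosine similarity
    between `vec` and that category's member vectors, scaled by `weight`.
    Assumes every category has at least one member."""
    return [
        weight * sum(cosine(vec, member) for member in members) / len(members)
        for members in categories
    ]


def likeness(u, v, categories, weight=1.0):
    """Similarity of two items after extending each feature vector with
    the category-derived additional features."""
    extended_u = list(u) + category_features(u, categories, weight)
    extended_v = list(v) + category_features(v, categories, weight)
    return cosine(extended_u, extended_v)


# Two items with no features in common, both similar to the one category member.
u = [1.0, 0.0, 0.0]
v = [0.0, 1.0, 0.0]
categories = [[[1.0, 1.0, 0.0]]]

plain = cosine(u, v)                  # similarity without the initial structure
guided = likeness(u, v, categories)   # similarity with the additional features
```

In this example `guided` exceeds `plain`: the two items share no original features, but both resemble the member of the initial category, so their likeness level rises as claim 2 describes. The `weight` parameter is one hypothetical way to control how strongly the initial structure guides the result.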

4. The method of claim 1 wherein the resulting organization structure includes a portion of the initial organization structure and at least one additional category coupled to the initial organization structure.

5. The method of claim 1 wherein the resulting organization structure relates to the initial organization structure based upon a category similarity.

6. The method of claim 1 wherein the resulting organization structure relates to the initial organization structure based upon a similarity of hierarchy structure.

7. The method of claim 1 wherein the item is a document, a product, a person, a DNA sequence, a purchase transaction, a financial record, or a species description.

8. The method of claim 1 further comprising outputting the resulting organization structure on an output device.

9. The method of claim 1 wherein the processing hardware uses at least a 500 MHz clock to efficiently run the clustering process.

10. The method of claim 1 wherein the plurality of items includes at least 10,000 items.

11. A computer aided information organization device, the device including one or more computer memories, the one or more computer memories including:

a first code directed to inputting at least 10,000 items in electronic form into a clustering process, each of the items including information;

a second code directed to inputting an initial organization structure in electronic form into the clustering process, the initial organization structure including one or more categories, at least one of the categories being associated with one of the items;

a third code directed to processing using at least processing hardware the plurality of items based upon at least the initial organization structure and the information in each of the items in at least the clustering process;

a fourth code directed to determining a resulting organization structure based upon the processing, the resulting organization structure more closely resembling the initial organization structure than if an empty organization structure or an alternative initial organization structure had been input into the clustering process; and

a fifth code directed to storing the resulting organization structure in the one or more memories or another memory.

12. The device of claim 11 further comprising a sixth code directed to determining a likeness level between a first item and a second item, the likeness level between the first item and the second item increases if they are both similar to items in one or more of the categories in the initial organization structure.

13. The device of claim 12 wherein the sixth code directed to determining the likeness level between the first item and the second item comprising:

a code directed to associating a first feature vector with the first item and a second feature vector with the second item, each feature vector relating to information associated with the item;

a code directed to extending the feature vector of each item with an additional feature representing a category of the initial organization structure, the additional feature relating to a degree to which the item is similar to one or more items in the corresponding category of the initial organization structure; and

a code directed to calculating a measure of similarity of the first item with the second item including calculating the similarity measure between the first additional feature and the second additional feature.

14. The device of claim 11 wherein the resulting organization structure includes a portion of the initial organization structure and at least one additional category coupled to the initial organization structure.

15. The device of claim 11 further comprising a sixth code directed to outputting the resulting organization structure, the resulting organization structure including a plurality of categories.

16. The device of claim 15 further comprising a seventh code directed to inputting additional items using the resulting organization structure.
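
Claim 16 does not say how additional items are input using the resulting organization structure. One possible reading, sketched below purely as an assumption, is that each new item is placed in the category whose centroid is nearest. The names `place_item` and `structure` are hypothetical.

```python
from math import dist


def place_item(vector, structure):
    """Return the name of the category in `structure` (category -> list of
    member feature vectors) whose centroid is closest to `vector`.

    Assumption: nearest-centroid placement; every category is non-empty.
    """
    def centroid(vectors):
        # Component-wise mean of the member vectors.
        return [sum(col) / len(vectors) for col in zip(*vectors)]

    return min(structure, key=lambda cat: dist(vector, centroid(structure[cat])))


# Illustrative resulting structure with two categories.
structure = {
    "low": [[0.0, 0.0], [0.2, 0.2]],
    "high": [[5.0, 5.0], [5.2, 4.8]],
}
new_item = [0.3, 0.1]
category = place_item(new_item, structure)
structure[category].append(new_item)  # the new item joins the chosen category
```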

17. The device of claim 11 further comprising a sixth code directed to independently modifying the resulting organization structure using a graphical user interface.

18. The device of claim 17 wherein the independently modifying is provided by a user coupled to the graphical user interface.